131 research outputs found

    Fine-Grain Iterative Compilation for WCET Estimation

    Compiler optimizations, although reducing the execution times of programs, raise issues in static WCET estimation techniques and tools. Flow facts, such as loop bounds, may not be automatically found by static WCET analysis tools after aggressive code optimizations. In this paper, we explore the use of iterative compilation (WCET-directed program optimization to explore the optimization space), with the objective to (i) allow flow facts to be automatically found and (ii) select optimizations that result in the lowest WCET estimates. We also explore the extent to which code outlining helps, by allowing the selection of different optimization options for different code snippets of the application.

    A Novel Approach for Ultra Low-Power WSN Node Generation

    Wireless Sensor Network (WSN) technology is now emerging with applications in various domains of human life, e.g., medicine, environmental monitoring, and military surveillance. WSN systems consist of low-cost and low-power sensor nodes that communicate efficiently over short distances. It has been shown that power consumption is the biggest design constraint for such systems. Currently, WSN nodes are being designed using low-power microcontrollers. However, their power dissipation is still orders of magnitude too high and limits the widespread adoption of WSN technology. In this paper, we propose an alternative approach that uses hardware specialization and power gating to generate distributed hardware micro-tasks. We target control-oriented tasks running on WSN nodes and present, as a case study, a lamp-switching application. Our approach is validated experimentally and shows substantial power gains over a software implementation on a low-power microcontroller such as the MSP430.

    Architectures de contrôleurs ultra-faible consommation pour noeuds de réseau de capteurs sans fil

    This article addresses the design of control architectures for the nodes of a sensor network. By jointly using hardware specialization to reduce dynamic power consumption and power gating during sleep phases, we propose an original architectural paradigm together with its functional design flow from high-level specifications (C combined with a purpose-built language). We illustrate the gains of a complete flow for generating hardware micro-tasks over conventional software implementations targeting microcontrollers. By combining hardware specialization with static power reduction techniques (power gating), we very significantly reduce the overall power (and energy) dissipated by the system. Results on benchmarks from the sensor network domain show energy gains of up to two orders of magnitude over the best low-power microcontrollers in the field.

    Spéculation temporelle algorithmique pour accélérateurs de réseaux de neuro

    In this paper, we propose a technique for improving the efficiency of hardware accelerators based on timing speculation (overclocking) and fault tolerance. We augment the accelerator with a lightweight error detection mechanism to protect against timing errors, enabling aggressive timing speculation. We demonstrate the validity of our approach for the convolution layers in Convolutional Neural Networks (CNN). We present an implementation of a fault-tolerant CNN accelerator combined with the lightweight error detection for convolution layers. The error detection mechanism we have developed works at the algorithm level, based on algebraic properties of the computation, allowing the full implementation to be realized using High-Level Synthesis tools. We use a set of Zybo boards to experimentally demonstrate that overclocking boosts the frequency by 17-36% with low chances of error, and that the infrequent errors can be detected with a negligible overhead (only 1000 LUTs).

    Aggressive Memory Speculation in HW/SW Co-Designed Machines

    Single-ISA heterogeneous systems (such as ARM big.LITTLE) are an attractive solution for embedded platforms as they expose performance/energy trade-offs directly to the operating system. Recent works have demonstrated the ability to increase their efficiency by using VLIW cores, supported through Dynamic Binary Translation (DBT) to maintain the illusion of a single-ISA system. However, VLIW cores cannot rival Out-of-Order (OoO) cores when it comes to performance, mainly because they do not use speculative execution. In this work, we study how memory dependency speculation can be used during the DBT process. Our approach enables fine-grained speculation optimizations thanks to a combination of hardware and software. Our results show that our approach leads to a geo-mean speed-up of 10% at the price of a 7% area overhead.

    Accelerating HMMER on FPGA using Parallel Prefixes and Reductions

    HMMER is a widely used tool in bioinformatics, based on Profile Hidden Markov Models. The computation kernels of HMMER, i.e., MSV and P7Viterbi, are very compute-intensive, and their data dependencies restrict them to sequential execution. In this paper, we propose an original parallelization scheme for HMMER that rewrites the mathematical formulation of these kernels to expose hidden parallelization opportunities. Our parallelization scheme targets FPGA technology, and our architecture can achieve a 10 times speedup compared with the latest HMMER3 SSE version, while not compromising on the sensitivity of the original algorithm. HMMER is a tool based on profile hidden Markov models that is very widely used in bioinformatics. The critical parts of the algorithm (the MSV and P7Viterbi functions) used in HMMER are very time-consuming and reputed to be very difficult to parallelize. In this article, we propose an original parallelization scheme for HMMER, based on a mathematical reformulation of the algorithm that uncovers new parallelization opportunities well suited to dedicated hardware implementations. We implemented this approach on an FPGA accelerator and measured performance gains of more than a factor of 10 over the HMMER3 software implementation, which nevertheless already makes extremely efficient use of the SIMD extensions of x8

    Using Model Types to Support Contract-Aware Model Substitutability

    Model typing brings the benefit associated with well-defined type systems to model-driven development (MDD) through the assignment of specific types to models. In particular, model type systems enable reuse of model manipulation operations (e.g., model transformations), where manipulations defined for models of a supertype can be used to manipulate models of subtypes. Existing model typing approaches are limited to structural typing defined in terms of object-oriented metamodels (e.g., MOF) in which the only structural (well-formedness) constraints are those that can be expressed directly in metamodeling notations (e.g., multiplicity and element containment constraints). In this paper we describe an extension to model typing that takes into consideration structural invariants, other than those that can be expressed directly in metamodeling notation, and specifications of behaviors associated with model types. The approach supports contract-aware substitutability, where contracts are defined in terms of invariants and pre-/postconditions expressed using OCL. Support for behavioral typing paves the way for behavioral substitutability. We also describe a technique to rigorously reason about model type substitutability as supported by contracts and apply the technique in use cases from the optimizing compiler community.

    Model-Driven Engineering and Optimizing Compilers: A bridge too far?

    A primary goal of Model Driven Engineering (MDE) is to reduce the cost and effort of developing complex software systems using techniques for transforming abstract views of software to concrete implementations. The rich set of tools that have been developed, especially the growing maturity of model transformation technologies, opens the possibility of applying MDE technologies to transformation-based problems in other domains. In this paper, we present our experience with using MDE technologies to build and evolve compiler infrastructures in the optimizing compiler domain. We illustrate, through our two ongoing research compiler projects for C and a functional language, the challenging aspects of optimizing compiler research and show how mature MDE technologies can be used to address them. We also identify some of the pitfalls that arise from unrealistic expectations of what can be accomplished using MDE and discuss how they can lead to unsuccessful and frustrating application of MDE technologies.

    From Scilab to multicore embedded systems: Algorithms and methodologies

    http://samos-conference.com/Resources_Samos_Websites/Proceedings_Repository_SAMOS/2012/Files/2012-IC-34.pdf
    While advances in processor architecture continue to increase hardware parallelism, parallel software creation is hard. There is an increasing need for tools and methodologies to narrow the entry gap for non-experts in parallel software development as well as to streamline the work for experts. This paper presents the methodology and algorithms for the creation of parallel software written in Scilab source code for multicore embedded processors in the context of the “Architecture oriented paraLlelization for high performance embedded Multicore systems using scilAb” (ALMA) EU FP7 project. In a nutshell, the ALMA parallelization approach attempts to manage the complexity of the task by alternating focus between very localized and holistic views of program optimization strategies.